Vector and Matrix Calculus

1. Gradient
2. Divergence
- 2.1. Definition
  - 2.1.1. Orthogonal Curvilinear Coordinate System
- 2.2. Interpretation
3. Curl
- 3.1. Generalization
4. Laplacian
- 4.1. Definition
- 4.2. Properties
5. Jacobian
6. Hessian
- 6.1. Definition
- 6.2. Properties
7. Identities
8. Notations
9. Derivative
- 9.1. Leibniz rule
- 9.2. Of Inverse Matrix
10. Exponential
- 10.1. Properties
11. Jacobi's Formula
- 11.1. Formula
- 11.2. Properties
12. Reference

	Scalar Field	Vector Field
0th Derivative	\( f \)	\(\mathbf{f}\)
1st Derivative	\(\nabla f\) Gradient	\(J(\mathbf{f})\) Jacobian
		\(\supset \nabla\cdot\mathbf{f}\) Divergence and \(\nabla\times\mathbf{f}\) Curl
2nd Derivative	\(H(f)\) Hessian
	\(\supset\nabla^2f\) Laplacian

\(\nabla\cdot\) and \(\nabla\times\) are formal products, so the usual rules for the dot product and the cross product may not apply.
\(\nabla\) is not compatible with the dot and cross products; for example, in general \( \mathbf{v}\cdot (\nabla \mathbf{w}) \neq (\mathbf{v}\cdot \nabla) \mathbf{w}. \)

1. Gradient

Vector field that represents the rate of change in a space.

1.1. Definition

For a morphism \(f\colon X\to Y\), written \(y = f(x)\), the gradient \(\nabla f\colon X\to Z\) is the map such that \[ dy = \langle \nabla f, dx\rangle \] where the bilinear map \(\langle \cdot, \cdot \rangle\colon Z\times X \to Y\) is well-defined, so that \(dy\) depends linearly on \(dx\).

1.1.1. Orthogonal Curvilinear Coordinate System

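\[ \nabla f = \sum_i \frac{1}{h_i}\frac{\partial f}{\partial \tilde{x}^i} \tilde{\mathbf{e}}_i \]
where \(\tilde{x}^i\) are the curvilinear coordinates, \(\tilde{\mathbf{e}}_i\) are the unit basis vectors, and the scale factors are \[ h_i = \left\| \frac{\partial \mathbf{r}}{\partial \tilde{x}^i}\right\|. \]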
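
As a rough illustration of the formula above, the following sketch evaluates the gradient in polar coordinates, where the scale factors are \(h_r = 1\) and \(h_\theta = r\). The test function, the step size `eps`, and the helper name `grad_polar` are assumptions made for this example only; derivatives are approximated by central differences.

```python
import math

def grad_polar(f, r, theta, eps=1e-6):
    """Components of grad f along the unit vectors (e_r, e_theta).

    Uses the polar scale factors h_r = 1 and h_theta = r, with partial
    derivatives approximated by central differences.
    """
    df_dr = (f(r + eps, theta) - f(r - eps, theta)) / (2 * eps)
    df_dtheta = (f(r, theta + eps) - f(r, theta - eps)) / (2 * eps)
    return df_dr / 1.0, df_dtheta / r

# Hypothetical test function f = x^2 + 3y, expressed in polar variables.
def f(r, theta):
    x, y = r * math.cos(theta), r * math.sin(theta)
    return x * x + 3.0 * y

r, theta = 2.0, 0.7
g_r, g_theta = grad_polar(f, r, theta)

# Rotate the polar components back to Cartesian ones, using
# e_r = (cos, sin) and e_theta = (-sin, cos).
gx = g_r * math.cos(theta) - g_theta * math.sin(theta)
gy = g_r * math.sin(theta) + g_theta * math.cos(theta)

x = r * math.cos(theta)
print(gx, 2 * x)   # both should be close to the analytic df/dx = 2x
print(gy, 3.0)     # both should be close to the analytic df/dy = 3
```

The comparison against the Cartesian gradient \((2x, 3)\) is only a sanity check; it does not replace the derivation.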

1.2. Properties

This can also be written in terms of differential form as: \[ dy = df(dx). \]

1.3. Formulae

Del in cylindrical and spherical coordinates - Wikipedia

2. Divergence

2.1. Definition

The divergence of a vector field \(\mathbf{F}\) is, in Cartesian coordinates, \[ \nabla\cdot \mathbf{F} = \frac{\partial F_{x_i}}{\partial {x_i}}, \] with summation over the repeated index \(i\).

2.1.1. Orthogonal Curvilinear Coordinate System

\[ \nabla\cdot \mathbf{F} = \frac{1}{\prod_j h_j}\sum_i\frac{\partial}{\partial \tilde{x}^i}\left(\tilde{F}^i\prod_{j\neq i}h_j\right) \] where \(\tilde{F}^i\) are the components along the unit basis vectors and \[ h_i = \left\| \frac{\partial \mathbf{r}}{\partial \tilde{x}^i}\right\|. \]
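
For polar coordinates (\(h_r = 1\), \(h_\theta = r\)) the formula above reduces to \(\frac{1}{r}\left(\partial_r (r F^r) + \partial_\theta F^\theta\right)\). The sketch below checks this numerically on two simple fields; the helper name `div_polar`, the step size, and the sample fields are assumptions chosen for illustration.

```python
def div_polar(F_r, F_theta, r, theta, eps=1e-6):
    """Divergence in polar coordinates from the components along e_r and e_theta.

    Implements (1 / r) * ( d(r F_r)/dr + d(F_theta)/dtheta ) with
    central differences.
    """
    d_r = ((r + eps) * F_r(r + eps, theta)
           - (r - eps) * F_r(r - eps, theta)) / (2 * eps)
    d_theta = (F_theta(r, theta + eps) - F_theta(r, theta - eps)) / (2 * eps)
    return (d_r + d_theta) / r

# Radial field F = (x, y): F_r = r, F_theta = 0, analytic divergence 2.
div_radial = div_polar(lambda r, t: r, lambda r, t: 0.0, 1.3, 0.4)

# Rotational field F = (-y, x): F_r = 0, F_theta = r, analytic divergence 0.
div_rotation = div_polar(lambda r, t: 0.0, lambda r, t: r, 1.3, 0.4)

print(div_radial, div_rotation)
```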

2.2. Interpretation

The net outward flux per unit volume.
Equivalently, the relative rate of change of the volume of a small region carried along by the flow of the vector field, that is, the rate of change of volume per unit volume.
- For a vector field given by a linear transformation:
  - \[ \nabla\cdot(\mathbf{Ax}) = \frac{d}{dt}\ln V \]
- The infinitesimal transformation generated by the vector field \(\mathbf{F}\) is: \[ \tilde{x}^i = x^i + F^i\,dt \]
  - The Jacobian of the transformation is: \[ J^i{}_j = \begin{bmatrix} 1 + \partial_{x^1}F^1dt & \partial_{x^2}F^1dt & \cdots & \partial_{x^n}F^1dt \\ \partial_{x^1}F^2dt & 1 + \partial_{x^2}F^2dt & \cdots & \partial_{x^n}F^2dt \\ \vdots & \vdots & \ddots & \vdots \\ \partial_{x^1}F^ndt & \partial_{x^2}F^ndt & \cdots & 1+ \partial_{x^n}F^ndt \\ \end{bmatrix} \]
  - Its determinant is: \[ \det J = 1 + \nabla\cdot \mathbf{F}\,dt + O(dt^2) \]
  - Taking the derivative in the limit \(dt \to 0\): \[ \left.\frac{d}{dt} \det J\right|_{dt\to 0} = \nabla\cdot \mathbf{F} \]
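
The relation between the determinant of the flow Jacobian and the divergence can be checked numerically. The sketch below uses a hypothetical 2D field \(\mathbf{F} = (x^2,\ 3y + x)\), whose divergence is \(2x + 3\); the field, the time step `dt`, and the finite-difference step `eps` are all assumptions made for this example.

```python
def F(x, y):
    # Hypothetical 2D vector field with divergence 2x + 3.
    return (x * x, 3.0 * y + x)

def jacobian_det_of_flow(x, y, dt, eps=1e-6):
    """det of the Jacobian of (x, y) -> (x, y) + F(x, y) dt, by central differences."""
    def phi(px, py):
        fx, fy = F(px, py)
        return (px + fx * dt, py + fy * dt)

    ax, ay = phi(x + eps, y)
    bx, by = phi(x - eps, y)
    cx, cy = phi(x, y + eps)
    dx, dy = phi(x, y - eps)
    j11, j21 = (ax - bx) / (2 * eps), (ay - by) / (2 * eps)  # first column
    j12, j22 = (cx - dx) / (2 * eps), (cy - dy) / (2 * eps)  # second column
    return j11 * j22 - j12 * j21

x0, y0, dt = 1.5, -0.5, 1e-4
det_j = jacobian_det_of_flow(x0, y0, dt)
rate = (det_j - 1.0) / dt      # approximates d(det J)/dt at dt = 0
divergence = 2 * x0 + 3.0      # analytic divergence of F at (x0, y0)
print(rate, divergence)
```

The two printed values agree up to the \(O(dt)\) remainder and finite-difference error.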

3. Curl

3.1. Generalization

3.1.1. Orthogonal Curvilinear Coordinate System

\[ \nabla\times \mathbf{F} = \frac{1}{h_1h_2h_3}\begin{vmatrix} h_1\tilde{\mathbf{e}}_1 & h_2\tilde{\mathbf{e}}_2 & h_3\tilde{\mathbf{e}}_3 \\[.5em] \dfrac{\partial}{\partial \tilde{x}^1} & \dfrac{\partial}{\partial \tilde{x}^2} & \dfrac{\partial}{\partial \tilde{x}^3} \\[1em] h_1\tilde{F}^1 & h_2\tilde{F}^2 & h_3\tilde{F}^3 \\ \end{vmatrix} \]
where \[ h_i = \left\| \frac{\partial \mathbf{r}}{\partial \tilde{x}^i}\right\|. \]

3.1.2. General Coordinate System

\[ (\nabla \times \mathbf{F} )^k = \frac{1}{\sqrt{g}} \varepsilon^{k\ell m} (\nabla_\ell \mathbf{F})_m \]
- using the covariant derivative.
By the symmetry of the Christoffel symbols, the connection terms cancel under the antisymmetric \(\varepsilon^{k\ell m}\), leaving \[ (\nabla \times \mathbf{F} ) = \frac{1}{\sqrt{g}} \mathbf{e}_k\varepsilon^{k\ell m} \partial_\ell F_m \]

3.1.3. Differential Form

\[ \nabla\times\mathbf{F} = \left(\star(\mathrm{d}\mathbf{F}^\flat)\right)^\sharp \]
where \(\flat\) and \(\sharp\) are the musical isomorphisms: \(\flat\) takes vectors to the corresponding 1-forms, and \(\sharp\) takes 1-forms back to vectors.

4. Laplacian

4.1. Definition

\[ \nabla^2 f = \nabla\cdot\nabla f \]
The notation \(\nabla^2\) is common in physics, and \(\Delta\) is common in mathematics.

4.2. Properties

The Laplacian is the divergence of the gradient.
It is also the trace of the Hessian.
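
A minimal numerical sketch of the trace-of-the-Hessian view: summing the second partial derivatives along each coordinate axis. The helper name `laplacian`, the step size, and the test function \(f = x^2y + y^3\) (with analytic Laplacian \(8y\)) are assumptions for illustration.

```python
def laplacian(f, x, eps=1e-4):
    """Sum of second partial derivatives (trace of the Hessian) by central differences."""
    total = 0.0
    fx = f(x)
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += eps
        xm[i] -= eps
        total += (f(xp) - 2.0 * fx + f(xm)) / eps**2
    return total

# Hypothetical test function with analytic Laplacian 2y + 6y = 8y.
def f(p):
    x, y = p
    return x * x * y + y**3

value = laplacian(f, [1.0, 2.0])
print(value)   # should be close to 8 * 2.0
```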

5. Jacobian

Transformation between curvilinear coordinate systems.

5.1. Definition

A Jacobian matrix of a vector field \(\mathbf{f}\) is \[ J^{i}{}_{j}=\frac{\partial f^i}{\partial x^j} \] where \(i\) is the row number and \(j\) is the column number.

It gives the rate of change of the vector field in any direction. Consider the identity \( \mathrm{d}f^i=J^{i}{}_{j}\,\mathrm{d}x^j \) or equivalently, \( \mathrm{d}\mathbf{f}=\mathbf{J}\,\mathrm{d}\mathbf{x} \).

Beware that some people prefer to use the transpose of this Jacobian as their Jacobian.
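
The row/column convention and the identity \(\mathrm{d}\mathbf{f} = \mathbf{J}\,\mathrm{d}\mathbf{x}\) can be illustrated with a finite-difference sketch. The helper `jacobian`, the step sizes, and the example map (polar to Cartesian) are assumptions chosen for this example, not part of any particular library.

```python
import math

def jacobian(f, x, eps=1e-6):
    """Finite-difference Jacobian with J[i][j] = d f^i / d x^j (row i, column j)."""
    m = len(f(x))
    n = len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp = list(x)
        xm = list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J

# Hypothetical example map: f(r, theta) = (r cos theta, r sin theta).
def f(p):
    r, theta = p
    return [r * math.cos(theta), r * math.sin(theta)]

p = [2.0, 0.5]
J = jacobian(f, p)

# First-order check of df = J dx for a small displacement dx.
dx = [1e-4, -2e-4]
df_exact = [a - b for a, b in zip(f([p[0] + dx[0], p[1] + dx[1]]), f(p))]
df_linear = [sum(J[i][j] * dx[j] for j in range(2)) for i in range(2)]
print(df_exact, df_linear)
```

Under the transposed convention mentioned above, the same numbers would be stored as `J[j][i]`.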

5.2. Inverse

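\[ (J^{-1})^i{}_j := \frac{\partial x^i}{\partial f^j} \] The inverse matrix can also be written concisely as \[ (J^{-1})^i{}_j = J_j{}^i. \]

As a mnemonic, the notation looks like reciprocating each element of the Jacobian matrix and transposing it. This is only notational: the entries of \(\mathbf{J}^{-1}\) are in general not the reciprocals of the entries of \(\mathbf{J}\), because the two sets of partial derivatives hold different variables fixed. The inverse has to be computed as a matrix inverse.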
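
A quick check with polar coordinates may help here. The sketch below assumes the example map \((x, y) = (r\cos\theta, r\sin\theta)\) and compares the matrix inverse of \(\mathbf{J}\) with the partial derivatives \(\partial x^i/\partial f^j\) of the inverse map \(r = \sqrt{x^2 + y^2}\), \(\theta = \operatorname{atan2}(y, x)\). It also shows that a single entry such as \(\partial r/\partial x\) is not the numerical reciprocal of \(\partial x/\partial r\). All variable names are illustrative.

```python
import math

r, theta = 2.0, 0.5
x, y = r * math.cos(theta), r * math.sin(theta)

# J[i][j] = d f^i / d x^j for (x, y) = f(r, theta).
J = [[math.cos(theta), -r * math.sin(theta)],
     [math.sin(theta),  r * math.cos(theta)]]

# Matrix inverse of the 2x2 Jacobian via its adjugate.
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
J_inv = [[ J[1][1] / det, -J[0][1] / det],
         [-J[1][0] / det,  J[0][0] / det]]

# Analytic partial derivatives of the inverse map (r, theta) as functions of (x, y).
K = [[x / r,      y / r],
     [-y / r**2,  x / r**2]]

dr_dx = K[0][0]   # equals cos(theta)
dx_dr = J[0][0]   # also equals cos(theta), so the entries are not reciprocals
print(J_inv, K)
print(dr_dx, dx_dr, 1.0 / dx_dr)
```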

5.3. Change of Basis

A Jacobian of coordinate transformation from coordinates \(x^j\) to coordinates \(\tilde{x}^i\) is \[ J^i{}_j=\frac{\partial \tilde{x}^i}{\partial x^j} \] which transforms the components.

To transform the basis, the inverse Jacobian is used. \[ \frac{\partial}{\partial \tilde{x}^j}=J_j{}^i\frac{\partial}{\partial x^i} \] equivalently, \[ \begin{bmatrix}\tilde{\mathbf{e}}_{1}&\tilde{\mathbf{e}}_{2}&\cdots&\tilde{\mathbf{e}}_{n}\end{bmatrix}=\begin{bmatrix}\mathbf{e}_{1}&\mathbf{e}_{2}&\cdots&\mathbf{e}_{n}\end{bmatrix}\mathbf{J}^{-1}. \]

\(\mathbf{J} : TM \to TN\)	\(TM \to TN\)	\(TN \to TM\)
Covariant	\(\mathbf{J}^{-1}\)	\(\mathbf{J}\)
Contravariant	\(\mathbf{J}\)	\(\mathbf{J}^{-1}\)

5.4. Determinant

The determinant of the Jacobian is the ratio of (signed) volumes under the transformation. Its absolute value is therefore used as the factor in the change of measure of an integral.
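
As an illustration of the change of measure, the sketch below integrates \(g(x, y) = x^2\) over the unit disk in polar coordinates, where \(|\det \mathbf{J}| = r\). The midpoint rule, the grid size `n`, and the integrand are assumptions chosen for this example; the analytic value is \(\pi/4\).

```python
import math

def integrate_over_unit_disk(g, n=200):
    """Midpoint-rule integral of g(x, y) over the unit disk using polar coordinates."""
    dr = 1.0 / n
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for k in range(n):
            theta = (k + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            # |det J| = r for the map (r, theta) -> (r cos theta, r sin theta).
            total += g(x, y) * r * dr * dtheta
    return total

integral = integrate_over_unit_disk(lambda x, y: x * x)
print(integral, math.pi / 4)
```

Dropping the factor `r` would integrate over the rectangle in \((r, \theta)\) instead of the disk and give a different result.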

6. Hessian

6.1. Definition

Hessian \(\mathbf{H}\) of a twice-differentiable scalar field \(f\) is: \[ H_{ij} = \frac{\partial^2 f}{\partial x^i\partial x^j}. \]

6.2. Properties

The Hessian matrix is the transpose of the Jacobian matrix of the gradient.
- excalidraw:./hessian.excalidraw
- \((\mathrm{d}\mathbf{x})^{\rm T}\mathbf{H}[f]\mathrm{d}\mathbf{x} = (\mathrm{d}\nabla f)^{\rm T}\mathrm{d}\mathbf{x}.\)
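- If it is evaluated at a stationary point, then \(\nabla f = \mathbf{0}\) there, so \(\mathrm{d}\nabla f = \mathbf{H}\,\mathrm{d}\mathbf{x}\) is, to first order, the gradient \(\nabla f\) at the displaced point \(\mathbf{x} + \mathrm{d}\mathbf{x}\).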
- Notice that \(\nabla f\) is the normal map, namely, a Gauss map.
If the Hessian is positive-definite at a stationary point \(\mathbf{x}\), then \(f\) attains an isolated local minimum at \(\mathbf{x}\); by the same token, if the Hessian is negative-definite there, then \(f\) attains an isolated local maximum.
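
The second-derivative test can be sketched numerically for a two-variable function. The helper `hessian`, the step size, and the quadratic test function \(f = x^2 + xy + 2y^2\) (stationary at the origin) are assumptions for illustration; positive-definiteness of the \(2\times 2\) matrix is checked with Sylvester's criterion.

```python
def hessian(f, x, eps=1e-4):
    """Finite-difference Hessian H[i][j] = d^2 f / (dx^i dx^j)."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Evaluate f with x^i shifted by si*eps and x^j shifted by sj*eps.
                y = list(x)
                y[i] += si * eps
                y[j] += sj * eps
                return f(y)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * eps**2)
    return H

# Hypothetical test function with a stationary point at the origin.
def f(p):
    x, y = p
    return x * x + x * y + 2.0 * y * y

H = hessian(f, [0.0, 0.0])
det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]

# Sylvester's criterion for a symmetric 2x2 matrix.
is_positive_definite = H[0][0] > 0 and det_H > 0
print(H, is_positive_definite)
```

For this function the analytic Hessian is \(\begin{bmatrix} 2 & 1 \\ 1 & 4 \end{bmatrix}\), so the origin is an isolated local minimum.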

7. Identities

Vector calculus identities - Wikipedia

8. Notations

There are two main notational conventions for taking a derivative with respect to a vector or a matrix: the numerator layout and the denominator layout. Each has its own advantages and disadvantages, and some authors mix the two. It is generally best to follow the layout used by the textbook or source at hand.

The numerator layout treats the vector in the numerator as a column vector, and the vector in the denominator as a row vector. For example, for \(\mathbf{y}\in\mathbb{R}^m\) and \(\mathbf{x}\in\mathbb{R}^n\), \[ \frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_n} \\ \vdots & \vdots &\ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \frac{\partial y_m}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_n} \\ \end{bmatrix} \] which matches the layout of the standard Jacobian.

Similarly, the denominator layout treats the vector in the numerator as a row vector, and the vector in the denominator as a column vector. For example, \[ \frac{\partial f}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ \frac{\partial f}{\partial x_2} \\ \vdots \\ \frac{\partial f}{\partial x_n} \\ \end{bmatrix} \] which matches the layout of the standard gradient.

A matrix can appear in either the numerator or the denominator, but not both. When a matrix is in the denominator, it is treated as the transpose of itself. In this matrix calculus notation, tensors of rank higher than 2 are not considered.

This notation is just for convenience. See Matrix calculus - Wikipedia for more.

9. Derivative

9.1. Leibniz rule

\[ \frac{d}{dx}(\mathbf{A}\mathbf{B}) = \frac{d\mathbf{A}}{dx}\mathbf{B} + \mathbf{A}\frac{d\mathbf{B}}{dx} \]

9.2. Of Inverse Matrix

\[\frac{d\mathbf{A}^{-1}}{dx}=-\mathbf{A}^{-1}\frac{d\mathbf{A}}{dx}\mathbf{A}^{-1}\]
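
The formula for the derivative of the inverse can be checked numerically on a small example. The matrix-valued function `A(x)`, its derivative `dA(x)`, and the step size are hypothetical choices for this sketch, which compares a central-difference derivative of \(\mathbf{A}^{-1}\) with \(-\mathbf{A}^{-1}\frac{d\mathbf{A}}{dx}\mathbf{A}^{-1}\).

```python
def inv2(M):
    """Inverse of a 2x2 matrix via its adjugate."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(P, Q):
    """Product of two 2x2 matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def A(x):
    # Hypothetical matrix-valued function, invertible near x = 0.5.
    return [[1.0 + x, x * x], [2.0, 3.0 + x]]

def dA(x):
    # Entry-wise derivative of A(x).
    return [[1.0, 2.0 * x], [0.0, 1.0]]

x0, eps = 0.5, 1e-6
Ap, Am = inv2(A(x0 + eps)), inv2(A(x0 - eps))
numeric = [[(Ap[i][j] - Am[i][j]) / (2 * eps) for j in range(2)] for i in range(2)]

A_inv = inv2(A(x0))
prod = matmul(matmul(A_inv, dA(x0)), A_inv)
formula = [[-prod[i][j] for j in range(2)] for i in range(2)]
print(numeric)
print(formula)
```

The formula itself follows from applying the Leibniz rule to \(\mathbf{A}\mathbf{A}^{-1} = \mathbf{I}\).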

10. Exponential

\[ e^{\mathbf{A}} := \sum_{n=0}^\infty \frac{\mathbf{A}^n}{n!}. \]

10.1. Properties

\[ \mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A} \implies e^{\mathbf{A}}e^\mathbf{B} = e^{\mathbf{A}+\mathbf{B}} \] The converse does not hold in general.
\[ e^\mathbf{O} = \mathbf{I},\quad \left(e^\mathbf{A}\right)^{-1} = e^{-\mathbf{A}},\quad \left(e^\mathbf{A}\right)^n = e^{n\mathbf{A}} \]
\[ \left(e^{\mathbf{A}}\right)^{\mathrm T} = e^{\mathbf{A}^\mathrm{T}}, \quad \operatorname{det}\left(e^\mathbf{A}\right) = e^{\operatorname{tr}(\mathbf{A})} \]
If \(\mathbf{A}\) is diagonalizable as \(\mathbf{A} = \mathbf{V}\mathbf{\Lambda}\mathbf{V}^{-1}\): \[ e^{\mathbf{A}} = \mathbf{V}e^{\mathbf{\Lambda}}\mathbf{V}^{-1}. \]
The solution to the differential equation \[ \mathbf{y}' = \mathbf{A}\mathbf{y}, \quad \mathbf{y}(0) = \mathbf{y}_0 \] is given by the matrix exponential: \[ \mathbf{y}(t) = e^{\mathbf{A}t}\mathbf{y}_0 \] for any constant square matrix \(\mathbf{A}\).
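
Several of these properties can be spot-checked with a truncated power series. The sketch below is a naive illustration, not a production algorithm: the helper names `expm` and `matmul`, the number of terms, and the sample matrices are assumptions, and the plain series is only adequate for small matrices of modest norm.

```python
import math

def matmul(P, Q):
    """Product of two square matrices given as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=30):
    """Matrix exponential by the truncated series sum_{k < terms} A^k / k!."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term A^k / k!, k = 0
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# exp of the generator [[0, 1], [-1, 0]] is a rotation by one radian.
R = expm([[0.0, 1.0], [-1.0, 0.0]])

# det(e^M) = e^{tr M} for a sample matrix with trace 0.7.
M = [[1.0, 2.0], [0.5, -0.3]]
det_exp = det2(expm(M))

# Non-commuting pair: e^{N1} e^{N2} differs from e^{N1 + N2}.
N1 = [[0.0, 1.0], [0.0, 0.0]]
N2 = [[0.0, 0.0], [1.0, 0.0]]
lhs = matmul(expm(N1), expm(N2))
rhs = expm([[0.0, 1.0], [1.0, 0.0]])
print(R, det_exp, math.exp(0.7))
print(lhs, rhs)
```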

11. Jacobi's Formula

Complementary to Liouville's formula.

11.1. Formula

\[ \frac{d}{dt}\det \mathbf{A}(t) = \operatorname{tr}\left(\operatorname{adj}(\mathbf{A}(t))\frac{d\mathbf{A}(t)}{dt}\right) \] where \(\operatorname{adj}\) is the adjugate matrix.
If \(\mathbf{A}(t)\) is invertible, this becomes \[ \frac{d}{dt}\det\mathbf{A}(t) = \det(\mathbf{A}(t)) \operatorname{tr}\left(\mathbf{A}^{-1}(t)\frac{d}{dt}\mathbf{A}(t)\right) \]
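
A small numerical check of Jacobi's formula, under stated assumptions: the matrix-valued function `A(t)` and its derivative `dA(t)` below are hypothetical examples, and the left-hand side is approximated by a central difference.

```python
import math

def A(t):
    # Hypothetical smooth matrix-valued function, chosen only for illustration.
    return [[1.0 + t, t * t], [math.sin(t), 2.0 - t]]

def dA(t):
    # Entry-wise derivative of A(t).
    return [[1.0, 2.0 * t], [math.cos(t), -1.0]]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def adj2(M):
    # Adjugate of a 2x2 matrix.
    return [[M[1][1], -M[0][1]], [-M[1][0], M[0][0]]]

def trace_of_product(P, Q):
    # tr(P Q) = sum over i, j of P[i][j] * Q[j][i]
    return sum(P[i][j] * Q[j][i] for i in range(2) for j in range(2))

t0, eps = 0.3, 1e-6
numeric = (det2(A(t0 + eps)) - det2(A(t0 - eps))) / (2 * eps)
jacobi = trace_of_product(adj2(A(t0)), dA(t0))
print(numeric, jacobi)
```

The adjugate form is used here because it does not require \(\mathbf{A}(t)\) to be invertible.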

11.2. Properties

This means
- \[ \frac{\partial \det\mathbf{A}}{\partial A_{ij}} = (\operatorname{adj}\mathbf{A})_{ji} = (\mathbf{C})_{ij}, \]
  - where \(\mathbf{C}\) is the cofactor matrix;
- \[ d\det(\mathbf{A}) = \operatorname{tr}(\operatorname{adj}(\mathbf{A})\,d\mathbf{A}) = \langle (\operatorname{adj}\mathbf{A})^{\rm T}, d\mathbf{A}\rangle_{\rm F}, \]
  - where \(\langle \cdot,\cdot\rangle_{\rm F}\) is the Frobenius inner product;
- \[ \nabla \operatorname{det}(\mathbf{A}) = (\operatorname{adj}\mathbf{A})^{\rm T} = \mathbf{C}, \]
  - where \(\nabla\) is the gradient.